﻿<html DIR="LTR" xmlns:tool="http://www.microsoft.com/tooltip" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ddue="http://ddue.schemas.microsoft.com/authoring/2003/5" xmlns:MSHelp="http://msdn.microsoft.com/mshelp">
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=utf-8" />
    <META NAME="save" CONTENT="history" />
    <title>Data Cleaning Package Sample</title>
    
 <Style TYPE="text/css">

body
{
    background: #FFFFFF;
    color: #000000;
    font-family:    Verdana;
    font-size: medium;
    font-style: normal;
    font-weight: normal;
    margin-top: 0;
    margin-bottom:  0;
    margin-left:    0;
    margin-right:   0;
    width:  100%;
}

div.#mainSection
{
    font-size: 70%;
    width: 100%;
    padding-left:    10;
    margin-right: 10;
}

div.#mainBody
{
    font-size: 90%;
    margin-top: 10;
    padding-bottom: 20;
}

div.#header
{
    background-color: #D2D2D2;
    padding-top:    0;
    padding-bottom: 0;
    padding-left:   10;
    padding-right:  0;
    width:          100%;
}

div.#header table
{
    border-bottom-color: #C8CDDE;
    border-bottom-style: solid;
    border-bottom-width: 1;
    width:  100%;
}

span.#runningHeaderText
{
    color: #003399;
    font-size: 90%;
}

span.#nsrTitle
{
/*    color: #003399;*/
    font-size: 120%;
    font-weight: 600;
}

div.#header table td
{
    color: #000000;
    font-size: 70%;
    margin-top: 0;
    margin-bottom:  0;
    padding-right: 20;
}

div.#header table tr.#headerTableRow3 td
{
    padding-bottom: 2;
    padding-top: 5;
}

div.#header table.#bottomTable
{
    border-top-color: #FFFFFF;
    border-top-style: solid;
    border-top-width: 1;
    text-align: left;
}

div.#footer
{
    font-size: 90%;
    margin-top: 0;
    margin-bottom:  0;
    margin-left:    -5;
    margin-right:   0;
    padding-top:    2;
    padding-bottom: 2;
    padding-left:   0;
    padding-right:  0;
    width:  100%;
}

hr.#footerHR
{
    border-bottom-color: #EEEEFF;
    border-bottom-style: solid;
    border-bottom-width: 1;
    border-top-color: #C8CDDE;
    border-top-style: solid;
    border-top-width: 1;
    height: 3;
    color: #D2D2D2;
}

div.section
{
    padding-top:    2;
    padding-bottom: 2;
    padding-right:  15;
    width:  100%;
}

.heading
{
    color:          #000000;
    font-weight:    bold;
    margin-top:     18;
    margin-bottom:  8;
}

h1.heading
{
    color: #000000;
    font-size:  150%;
}

.subHeading
{
    color:          #000000;
    font-weight:    bold;
    font-size:      150%;
    margin-bottom:  4;
}

h2.subHeading
{
    color:          #000000;
    font-weight:    bold;
    font-size:      130%;
}
h3.subHeading
{
    color:  #000000;
    font-size: 125%;
    font-weight: bold;
}

h4.subHeading
{
    color: #000000;
    font-size: 110%;
    font-weight: bold;
}

h4.procedureHeading
{
    color: #000080;
    font-size: 110%;
    font-weight: bold;
}

h5.subHeading
{
    color: #000000;
    font-size: 100%;
    font-weight: bold;
}

img
{
    padding-bottom: 10;
}

img.toggle
{
    border: 0;
    margin-right: 5;
    padding-bottom: 10;
}

img.copyCodeImage
{
    border: 0;
    margin: 1;
    margin-right: 3;
    padding-bottom: 10;
}

img.downloadCodeImage
{
    border: 0;
    margin-right: 3;
    padding-bottom: 10;
}

img.viewCodeImage
{
    border: 0;
    margin-right: 3;
    padding-bottom: 10;
}

img.note
{
    border: 0;
    margin-right: 3;
    padding-bottom: 10;
}

img.#membersOptionsFilterImage
{
    border: 0;
    margin-left: 10;
    vertical-align: middle;
    padding-bottom: 10;
}

img.#toggleAllImage
{
    margin-left: 4;
    vertical-align: middle;
    padding-bottom: 10;
}

div.#mainSection table
{
    border: 0;
    font-size: 100%;
    width:  100%;
    margin-top: 5px;
    margin-bottom: 15px;
}

div.#mainSection table tr
{
    vertical-align: top;
}

div.#mainSection table th
{
    text-align: left;
    background: #D8D8D8;
    border-bottom-color: #D8D8D8;
    border-bottom-style: solid;
    border-bottom-width: 1;
    color: #000000;
    padding-left: 5;
    padding-right: 5;
}

div.#mainSection table td
{
    background: #F2F2F2;
    border-top-color: #D8D8D8;
    border-top-style: solid;
    border-top-width: 1;
    padding-left: 5;
    padding-right: 5;
}

div.#mainSection table td.imageCell
{
    white-space: nowrap;
}

div.code
{
	width: 98%;
}

div.code table
{
    border: 0;
    font-size: 95%;
    margin-bottom: 5;
    width: 100%
}

div.code table th
{   
    text-align: left;
    background: #D8D8D8;
    border-bottom-color: #D8D8D8;
    border-bottom-style: solid;
    border-bottom-width: 1;
    color: #000000;
    font-weight: bold;
    padding-left: 5;
    padding-right: 5;
}

div.code table td
{
    background: #CCCCCC;
    border-top-color: #D8D8D8;
    border-top-style: solid;
    border-top-width: 1;
    padding-left: 5;
    padding-right: 5;
    padding-top: 5;
}

div.alert
{
	margin-left: 10;
	width: 98%;
}

div.alert table
{
    border: 1;
    font-size: 100%;
    width:  100%;
    border: solid 1 #DEDFEF;
}

div.alert table th
{
    text-align: left;
    background: #D8D8D8;
    border-bottom-width: 0;
    color: #000000;
    padding-left: 5;
    padding-right: 5;
    border: solid 1 #DEDFEF;
}

div.alert table td
{
    background: #FFFFFF;
    border-top-color: #D8D8D8;
    border-top-style: solid;
    border-top-width: 1;
    padding-left: 5;
    padding-right: 5;
    border: solid 1 #DEDFEF;
}

span.copyCode
{
    color: #0000ff;
    font-size: 90%;
    font-weight: normal;
    cursor: hand;
    float: right;
    display: inline;
    text-align: right;
}

.downloadCode
{
    color: #0000ff;
    font-size: 90%;
    font-weight: normal;
    cursor: hand;
}

.viewCode
{
    color: #0000ff;
    font-size: 90%;
    font-weight: normal;
    cursor: hand;
}

div.code pre
{
    font-family:    Monospace, Courier New, Courier;
    font-size: 105%;
    color:  #000000;
}

code
{
    font-family:    Monospace, Courier New, Courier;
    font-size: 105%;
    color:  #000000;
}

dl
{
    margin-top: 0;
    padding-left:   1;
}

dd
{
    margin-bottom:  0;
    margin-left:    0;
    padding-left:   20;
}

dd p
{
    margin-top: 5;
}

ul
{
    margin-left: 17;
    list-style-type: disc;
}

ul ul
{
    margin-bottom: 4;
    margin-left: 17;
    margin-top: 3;
    list-style-type: disc;
}

ol
{
    margin-left: 24;
    list-style-type: decimal;
}

ol ol
{
    margin-left: 24;
    margin-top: 3;
    list-style-type: lower-alpha;
}

li
{
    margin-top: 0;
    margin-bottom: 0;
    padding-bottom: 0;
    padding-top: 0;
    margin-left: 5;
}

p
{
    margin-bottom: 15;
}

.tip
{
    color:  #0000FF;
    font-style: italic;
    cursor:hand;
    text-decoration:underline;
}

.math
{
    font-family: Times New Roman;
    font-size: 125%
}
.sourceCodeList
{
    font-family: Verdana;
    font-size: 90%; 
}

pre.viewCode
{
    width: 100%;
    overflow: auto;
}

li:hover table, li.over table
{
    background-color: #C0C0C0;
}

li:hover ul, li.over ul
{ 
    background-color: #d2d2d2;
    border: 1px solid #000;
    display: block;
}

</style>
  </head>
  <body>
    <!--Topic built:6/7/2007-->

    <div id="header">
      <table width="100%" id="topTable">
        <tr>
          <td align="left">
            <span id="nsrTitle">Data Cleaning Package Sample</span>
          </td>
          <td align="right">
            <span id="headfb" class="feedbackhead">
            </span>
          </td>
        </tr>
        <tr id="headerTableRow3">
          <td />
          <td align="right">
          </td>
        </tr>
      </table>
      </div>
    <div id="mainSection">
      <div id="mainBody">

        <span id="changeHistory">
        </span>
    <p>
      This sample works only with SQL Server 2005 and SQL Server 2008. It will not work with any version of SQL Server earlier than SQL Server 2005.
    </p>
    <p>The Data Cleaning sample is a package that cleans a list of names and addresses that represent potential customers. The data requires cleaning: it contains spelling errors and missing information, and it includes customers who are already in the database, incorrect customers, and multiple subtly different instances of the same customer. </p>
    <p>The package control flow consists of two tasks. The first is an Execute SQL task that creates the input table, <b>CustomerLeads</b>, and the three output tables named <b>ExistingCustomerLeads</b>, <b>NewCustomerLeads</b>, and <b>DuplicateCustomerLeads</b>. The second is a Data Flow task that runs the data flow, which cleans the data extracted from the <b>CustomerLeads</b> table. The data flow identifies unique new, existing, and duplicate customers, and writes the rows of each customer type to the appropriate output table. </p>
    <p>If you run the sample on a non-English version of Windows, you may have to substitute the localized name of the Program Files folder to open or run the sample.</p>
    <div class="alert"><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left">Note: </th></tr><tr><td>
      This sample uses the Fuzzy Grouping and Fuzzy Lookup transformations, which are available only in the Enterprise version of SQL Server.<p></p>
    </td></tr></table><p></p></div>
    <div class="alert"><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left">Important: </th></tr><tr><td>
      Samples are provided for educational purposes only. They are not intended to be used in a production environment and have not been tested in a production environment. Microsoft does not provide technical support for these samples.<p></p>
    </td></tr></table><p></p></div>
    <p>To learn more about data cleaning, search for the following articles in the <a href="http://go.microsoft.com/fwlink/?LinkId=80475" alt=""><linkText xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">MSDN Library</linkText></a>:</p>
    <ul><li>
        Data Cleansing Applications with SQL Server Integration Services (Windows Media Video)<br></br>
      </li><li>
        Data Cleaning using the Fuzzy Grouping and Fuzzy Lookup Transformations (white paper)<br></br>
      </li></ul>
  <h1 class="heading">Requirements</h1><div id="requirementsSection" class="section">
    <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
      <p xmlns="">Running this sample package requires the following:</p>
      <ul xmlns=""><li>
          The <b>AdventureWorks</b> database must be installed, and you must have administrative permissions on it. <br></br>
        </li><li>
          If you intend only to run the sample package from the command line, you must install Integration Services. <br></br>
        </li><li>
          If you intend to open the package in SSIS Designer and run the sample package, you must install Business Intelligence Development Studio. <br></br>
        </li></ul>
      <p xmlns="">For more information about how to install samples, see "Installing Sample Integration Services Packages" in SQL Server Books Online. </p>
    </content>
  </div><h1 class="heading">Location of the Sample Package</h1><div id="sectionSection0" class="section"><content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
      <p xmlns="">If the samples were installed to the default installation location, the Data Cleaning package is located in the following folder: </p>
      <p xmlns="">
        C:\Program Files\Microsoft SQL Server\100\Samples\Integration Services\Package Samples\DataCleaning Sample\Data Cleaning\.</p>
      <p xmlns="">The following files are required to run this sample package.</p>
      <h3 class="subHeading" xmlns=""></h3><table width="100%" cellspacing="0" cellpadding="0" border="1" style="background-color: #CCCCCC;" xmlns=""><tr>
            <th>
              File
            </th>
            <th>
              Description
            </th>
          </tr><tr>
          <td>
            <p>DataCleaning.dtsx</p>
          </td>
          <td>
            <p>The sample package.</p>
          </td>
        </tr><tr>
          <td>
            <p>CreateTables.sql</p>
          </td>
          <td>
            <p>SQL statements to create tables.</p>
          </td>
        </tr></table>
    </content></div><h1 class="heading">Adding Data Viewers to the Sample</h1><div id="sectionSection1" class="section"><content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
      <p xmlns="">To better understand how the Data Cleaning package works, you can add data viewers to the data flow and then view the data as it moves between data flow components. We recommend that you add data viewers to the following paths:</p>
      <ul xmlns=""><li>
          Path from <b>Union All</b> to <b>OLE DB Destination - Existing Customers</b><br></br>
        </li><li>
          Path from <b>Conditional Split on Canonical Record for Group</b> to <b>OLE DB Destination - Unique Customer Leads</b><br></br>
        </li><li>
          Path from <b>Conditional Split on Canonical Record for Group</b> to <b>OLE DB Destination - Duplicate Customer Leads</b><br></br>
        </li></ul>
      <h4 class="procedureHeading" xmlns="">To add data viewers</h4><div id="procedureSectionEBBJBHA" class="section" xmlns=""><ol><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Right-click the path and then click <b>Data Viewers</b>.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">In the Data Flow Path Editor, click <b>Add</b>.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">In the <b>Configure Data Viewer </b>dialog box, click <b>Grid</b> in the type list. By default, all columns display in the data viewer. </p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Repeat steps 1-3 for other paths.</p>
            </content>
          </li></ol></div>
    </content></div><h1 class="heading">Running the Sample</h1><div id="sectionSection2" class="section"><content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
      <p xmlns="">The package can be run from the command line by using the <b>dtexec</b> utility, or can be run in Business Intelligence Development Studio.</p>
      <p xmlns="">If you are using a non-English version of Windows, you may have to update the <b>ConnectionString</b> property of any file connection managers used in the package to run the sample package successfully. Verify that the path used in the connection manager is valid on your computer, and if required, modify the path so that it uses the localized name of the Program Files folder.</p>
      <p xmlns="">For this sample, you may have to update "Program Files" in the <b>ConnectionString</b> property for the CreateTables.sql connection manager.</p>
      <h4 class="procedureHeading" xmlns="">To run the package by using dtexec</h4><div id="procedureSectionEHBHBHA" class="section" xmlns=""><ol><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Open a Command Prompt window.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Change the directory to C:\Program Files\Microsoft SQL Server\100\DTS\Binn, the location of <b>dtexec</b>.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Type the following command:</p>
              <div class="code" xmlns=""><span codeLanguage="other"><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left"></th></tr><tr><td colspan="2"><pre>dtexec /f "C:\Program Files\Microsoft SQL Server\100\Samples\Integration Services\Package Samples\Data Cleaning Sample\DataCleaning\DataCleaning.dtsx"</pre></td></tr></table></span></div>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Press <b>Enter</b>.</p>
            </content>
          </li></ol></div>
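      <p xmlns="">The procedure above can also be scripted. The following Python sketch is not part of the sample; it only builds the <b>dtexec</b> argument list shown in step 3. The function name <code>build_dtexec_command</code> is hypothetical, and the sketch assumes that the <code>ProgramFiles</code> environment variable holds the localized name of the Program Files folder, with the English default used as a fallback.</p>
      <div class="code" xmlns=""><span codeLanguage="other"><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left"></th></tr><tr><td colspan="2"><pre>
```python
import os
from pathlib import PureWindowsPath

# Location of the package below the Program Files folder, copied from the
# dtexec command in step 3 (adjust it if the samples were installed elsewhere).
PACKAGE_RELATIVE = (
    r"Microsoft SQL Server\100\Samples\Integration Services"
    r"\Package Samples\Data Cleaning Sample\DataCleaning\DataCleaning.dtsx"
)


def build_dtexec_command(program_files=None):
    """Return the dtexec argument list for the sample package (hypothetical helper)."""
    # On a localized Windows installation the folder name can differ, so read
    # it from the environment and fall back to the English default.
    root = program_files or os.environ.get("ProgramFiles", r"C:\Program Files")
    package = PureWindowsPath(root) / PACKAGE_RELATIVE
    # /f tells dtexec to load the package from the file system.
    return ["dtexec", "/f", str(package)]


print(build_dtexec_command())
```
</pre></td></tr></table></span></div>
      <p xmlns="">On a computer where <b>dtexec</b> is on the path, or when the script is started from the Binn folder, the returned list could be passed to <code>subprocess.run</code> to run the package.</p>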
      <p xmlns="">For more information about how to run the package by using the <b>dtexec</b> utility, see the topic, "dtexec Utility", in SQL Server Books Online. </p>
      <h4 class="procedureHeading" xmlns="">To run the package in Business Intelligence Development Studio</h4><div id="procedureSectionEDBHBHA" class="section" xmlns=""><ol><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Open Business Intelligence Development Studio.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">On the <b>File</b> menu, point to <b>Open</b>, and then click <b>Project/Solution</b>.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">Locate the <b>DataCleaning Sample</b> folder, and then double-click the file named DataCleaning.sln.</p>
            </content>
          </li><li>
            <content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
              <p xmlns="">In <b>Solution Explorer</b>, right-click DataCleaning.dtsx in the <b>SSIS Packages</b> folder, and then click <b>Execute Package</b>. </p>
            </content>
          </li></ol></div>
      <div class="alert" xmlns=""><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left">Note: </th></tr><tr><td>
        If you open the package in SSIS Designer and view the package properties, you will notice that the <b>DelayValidation</b> property is set to <b>True</b>. Validation of the package must be delayed because some tables used by the Data Cleaning sample package (the <b>CustomerLeads</b> input table and the three output tables named <b>ExistingCustomerLeads</b>, <b>NewCustomerLeads</b>, and <b>DuplicateCustomerLeads</b>) are not created until the first time the package runs. If <b>DelayValidation</b> is set to <b>False</b>, a validation error occurs when you open the package in SSIS Designer before the package has run for the first time.<p></p>
      </td></tr></table><p></p></div>
    </content></div><h1 class="heading">Components in Sample</h1><div id="sectionSection3" class="section"><content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
      <p xmlns="">The following table lists the tasks, containers, data sources and destinations, and transformations that are used within the sample.</p>
      <h3 class="subHeading" xmlns=""></h3><table width="100%" cellspacing="0" cellpadding="0" border="1" style="background-color: #CCCCCC;" xmlns=""><tr>
            <th>
              Element
            </th>
            <th>
              Purpose
            </th>
          </tr><tr>
          <td>
            <p>Execute SQL task</p>
          </td>
          <td>
            <p>The Execute SQL task is named <b>Create Customer Address Reference Table View, Populate NewCustomer Input Table and Create Output Tables</b>. This task creates the input table, <b>CustomerLeads</b>, and also creates the three output tables named <b>ExistingCustomerLeads</b>, <b>NewCustomerLeads</b>, and <b>DuplicateCustomerLeads</b>. </p>
          </td>
        </tr><tr>
          <td>
            <p>Data Flow task</p>
          </td>
          <td>
            <p>The Data Flow task, <b>Fuzzy Lookup Data Flow Task</b>, executes the data flow in the package.</p>
          </td>
        </tr><tr>
          <td>
            <p>OLE DB source </p>
          </td>
          <td>
            <p>The OLE DB source, <b>OLE DB Source - Customer Leads</b>, reads records from the <b>CustomerLeads</b> table.</p>
          </td>
        </tr><tr>
          <td>
            <p>Lookup transformation</p>
          </td>
          <td>
            <p>The Lookup transformation, <b>Lookup against Existing Customers</b>, performs an exact lookup to identify existing customers. If the lookup succeeds, the record is inserted into the <b>ExistingCustomerLeads</b> table.</p>
          </td>
        </tr><tr>
          <td>
            <p>Derived Column transformation</p>
          </td>
          <td>
            <p>The Derived Column transformation, <b>Derived Column</b>, adds the <b>_Similarity</b> column to each row and sets the column value to 1.</p>
          </td>
        </tr><tr>
          <td>
            <p>Fuzzy Lookup transformation</p>
          </td>
          <td>
            <p>The Fuzzy Lookup transformation, <b>Fuzzy Lookup against Existing Customers</b>, performs a fuzzy lookup to identify customer records that are fuzzy matches of existing customer records. </p>
            <p>The transformation adds a <b>_Similarity</b> column that contains a similarity score to each row. The score 0.0 means no match was found, whereas 1.0 means an exact match was found. A score between 0.0 and 1.0 is a measure of similarity in which a value closer to 1.0 indicates greater similarity. </p>
          </td>
        </tr><tr>
          <td>
            <p>Conditional Split transformation</p>
          </td>
          <td>
            <p>The first Conditional Split transformation, <b>Conditional Split on _Similarity</b>, directs input rows to one of two outputs depending on the value of the similarity score determined by the fuzzy lookup. Rows with a similarity score &gt;= 0.70 are written to the <b>ExistingCustomerLeads</b> table. Rows with a similarity score &lt; 0.70 are probably valid new customer leads, and additional cleaning is done on these rows. </p>
            <p>The second Conditional Split transformation, <b>Conditional Split on Canonical Record for Group</b>, directs input rows to one of two outputs depending on whether the data row is a duplicate. If the values of the <b>_key_in</b> and <b>_key_out</b> columns are equal, the row is used as the canonical row in the group, and the canonical row is inserted into the <b>NewCustomerLeads</b> table. If the <b>_key_in</b> and <b>_key_out</b> columns are not equal, the row is treated as a fuzzy duplicate and the row is inserted into the <b>DuplicateCustomerLeads</b> table.</p>
          </td>
        </tr><tr>
          <td>
            <p>Union All transformation</p>
          </td>
          <td>
            <p>The Union All transformation, <b>Union All</b>, merges rows of existing customers—both exact and fuzzy matches—into one dataset.</p>
          </td>
        </tr><tr>
          <td>
            <p>Fuzzy Grouping transformation</p>
          </td>
          <td>
            <p>The Fuzzy Grouping transformation, <b>Fuzzy Grouping</b>, groups customers who are likely duplicates. The transformation adds three columns, <b>_key_in</b>, <b>_key_out</b>, and <b>_score</b>, to each row. <b>_key_in</b> is a unique identifier assigned to each input row, and <b>_key_out</b> contains the <b>_key_in</b> value of the row that best represents all the rows in a fuzzy group (the canonical row). All rows in a fuzzy group have the same <b>_key_out</b> value. The <b>_score</b> column is a value between 0.0 and 1.0 that describes the textual similarity between a given input row and the canonical row. </p>
          </td>
        </tr><tr>
          <td>
            <p>OLE DB destinations</p>
          </td>
          <td>
            <p>The OLE DB destination, <b>OLE DB Destination - Existing Customers</b>, inserts rows into the <b>ExistingCustomerLeads</b> table.</p>
            <p>The OLE DB destination, <b>OLE DB Destination - Unique Customer Leads</b>, inserts rows into the <b>NewCustomerLeads</b> table.</p>
            <p>The OLE DB destination, <b>OLE DB Destination - Duplicate Customer Leads</b>, inserts rows into the <b>DuplicateCustomerLeads</b> table.</p>
          </td>
        </tr><tr>
          <td>
            <p>File connection manager</p>
          </td>
          <td>
            <p>The File connection manager, <b>CreateTables.sql</b>, connects to the file that contains the SQL the package uses.</p>
          </td>
        </tr><tr>
          <td>
            <p>OLE DB connection manager</p>
          </td>
          <td>
            <p>The OLE DB connection manager, <b>(local).AdventureWorks</b>, connects to the <b>AdventureWorks</b> database on the local server.</p>
          </td>
        </tr></table>
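      <p xmlns="">As a rough illustration of the two Conditional Split rules described in the table above, the following Python sketch routes a row to the name of one of the three output tables. It is not part of the sample package. The function name <code>route_lead</code> and the sample rows are hypothetical, and the sketch assumes that every row already carries the <b>_Similarity</b>, <b>_key_in</b>, and <b>_key_out</b> values; in the package itself, the key columns are added only to rows that reach the Fuzzy Grouping transformation.</p>
      <div class="code" xmlns=""><span codeLanguage="other"><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left"></th></tr><tr><td colspan="2"><pre>
```python
SIMILARITY_THRESHOLD = 0.70  # cut-off used by Conditional Split on _Similarity


def route_lead(row: dict) -> str:
    """Return the name of the output table that a row would be written to."""
    # First split: exact matches (_Similarity set to 1 by Derived Column) and
    # close fuzzy matches are treated as existing customers.
    if row["_Similarity"] >= SIMILARITY_THRESHOLD:
        return "ExistingCustomerLeads"
    # Second split: the canonical row of a fuzzy group has _key_in == _key_out.
    if row["_key_in"] == row["_key_out"]:
        return "NewCustomerLeads"
    # Any other row in the group is a fuzzy duplicate of the canonical row.
    return "DuplicateCustomerLeads"


# Hypothetical rows; the names and scores are invented for illustration.
sample_rows = [
    {"name": "A. Smith", "_Similarity": 1.0, "_key_in": 1, "_key_out": 1},
    {"name": "B. Jones", "_Similarity": 0.35, "_key_in": 2, "_key_out": 2},
    {"name": "B. Jonse", "_Similarity": 0.30, "_key_in": 3, "_key_out": 2},
]

for row in sample_rows:
    print(row["name"], "->", route_lead(row))
```
</pre></td></tr></table></span></div>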
      <p xmlns="">The following table describes the data in the output tables.</p>
      <h3 class="subHeading" xmlns=""></h3><table width="100%" cellspacing="0" cellpadding="0" border="1" style="background-color: #CCCCCC;" xmlns=""><tr>
            <th>
              Table
            </th>
            <th>
              Description
            </th>
          </tr><tr>
          <td>
            <p>
              <b>ExistingCustomerLeads</b>
            </p>
          </td>
          <td>
            <p>Contains records that exactly match an existing customer, and records that fuzzily match an existing customer with very high textual similarity. </p>
          </td>
        </tr><tr>
          <td>
            <p>
              <b>NewCustomerLeads</b>
            </p>
          </td>
          <td>
            <p>Contains records for which there was no good match to an existing customer. If the list contained multiple instances of the same name, or highly similar versions of a particular name, only one record is directed to <b>NewCustomerLeads</b>, and the duplicates are directed to <b>DuplicateCustomerLeads</b>.</p>
          </td>
        </tr><tr>
          <td>
            <p>
              <b>DuplicateCustomerLeads</b>
            </p>
          </td>
          <td>
            <p>Contains duplicates of new customers.</p>
          </td>
        </tr></table>
    </content></div><h1 class="heading">Sample Results</h1><div id="sectionSection4" class="section"><content xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">
      <p xmlns="">To see the execution results of the Data Cleaning sample package, run the following Transact-SQL query:</p>
      <div class="code" xmlns=""><span codeLanguage="other"><table width="100%" cellspacing="0" cellpadding="0"><tr><th align="left"></th></tr><tr><td colspan="2"><pre>Select * from AdventureWorks.FuzzyLookupExample.ExistingCustomerLeads
Select * from AdventureWorks.FuzzyLookupExample.NewCustomerLeads
Select * from AdventureWorks.FuzzyLookupExample.DuplicateCustomerLeads
</pre></td></tr></table></span></div>
    </content></div><!--[if gte IE 5]>
			<tool:tip element="seeAlsoToolTip" avoidmouse="false"/><tool:tip element="languageFilterToolTip" avoidmouse="false"/><tool:tip element="roleInfoSpan" avoidmouse="false"/>
		<![endif]--></div>
      <div id="footer">
        
			
			© 2007 Microsoft Corporation. All rights reserved.
 	
      </div>
    </div>
  </body>
</html>